decision error
A Fenchel-Young Loss Approach to Data-Driven Inverse Optimization
Li, Zhehao, Wu, Yanchen, Mao, Xiaojie
Data-driven inverse optimization seeks to estimate unknown parameters in an optimization model from observations of optimization solutions. Many existing methods struggle with noisy and suboptimal solution observations and also suffer from computational challenges. In this paper, we build a connection between inverse optimization and the Fenchel-Young (FY) loss originally designed for structured prediction, proposing an FY loss approach to data-driven inverse optimization. This new approach is amenable to efficient gradient-based optimization and is hence much more efficient than existing methods. We provide theoretical guarantees for the proposed method and use extensive simulation and real-data experiments to demonstrate its significant advantages in parameter estimation accuracy, decision error, and computational speed. (An illustrative sketch appears after the tags below.)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Transportation (0.46)
- Energy > Power Industry (0.45)
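The connection is easiest to see in the unregularized case, where the FY loss of an estimated objective theta against an observed solution y is max_x <theta, x> - <theta, y>, with (sub)gradient x*(theta) - y. The sketch below, a minimal illustration rather than the authors' implementation, applies this on a toy linear program; the `solve` oracle and the random feasible region are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def solve(theta, A_ub, b_ub):
    """Hypothetical LP oracle: argmax_x <theta, x> s.t. A_ub @ x <= b_ub, x >= 0."""
    res = linprog(-theta, A_ub=A_ub, b_ub=b_ub, method="highs")  # linprog minimizes
    return res.x

def fy_loss_and_grad(theta, y_obs, A_ub, b_ub):
    """Unregularized Fenchel-Young loss and its (sub)gradient in theta:
    L(theta; y) = max_x <theta, x> - <theta, y> >= 0,  dL/dtheta = x*(theta) - y."""
    x_star = solve(theta, A_ub, b_ub)
    return theta @ x_star - theta @ y_obs, x_star - y_obs

# Plain gradient descent on the estimated objective coefficients (toy data).
rng = np.random.default_rng(0)
A_ub, b_ub = rng.random((3, 5)), np.ones(3)   # assumed toy feasible region
theta_true = rng.normal(size=5)
y_obs = solve(theta_true, A_ub, b_ub)         # noiseless observed solution

theta = np.zeros(5)
for _ in range(200):
    _, grad = fy_loss_and_grad(theta, y_obs, A_ub, b_ub)
    theta -= 0.1 * grad
```

Each gradient step costs a single solver call, which is why the approach lends itself to plain (stochastic) gradient descent, the efficiency the abstract refers to.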
Threshold Learning for Optimal Decision Making
Decision making under uncertainty is commonly modelled as a process of competitive stochastic evidence accumulation to threshold (the drift-diffusion model). However, it is unknown how animals learn these decision thresholds. We examine threshold learning by constructing a reward function that, averaged over many trials, converges to Wald's cost function, which defines decision optimality. These rewards are highly stochastic and hence challenging to optimize, which we address in two ways: first, with a simple two-factor reward-modulated learning rule derived from Williams' REINFORCE method for neural networks; and second, with Bayesian optimization of the reward function using a Gaussian process. Bayesian optimization converges in fewer trials than REINFORCE but is computationally slower, with greater variance. The REINFORCE method is also a better model of acquisition behaviour in animals, and a similar learning rule has been proposed for modelling basal ganglia function. (An illustrative sketch of the REINFORCE rule appears after the tags below.)
- Europe > United Kingdom > England > Bristol (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Education (0.85)
- Health & Medicine (0.67)
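As a rough illustration of the first approach, the sketch below perturbs a decision threshold on each simulated drift-diffusion trial and applies a two-factor REINFORCE update that correlates the perturbation with a reward prediction error. The trial simulator, cost constants, and Gaussian exploration policy are assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def ddm_trial(drift, threshold, dt=1e-3, noise=1.0):
    """Simulate one drift-diffusion trial; return (correct, reaction_time)."""
    x, t = 0.0, 0.0
    while abs(x) < threshold:
        x += drift * dt + noise * np.sqrt(dt) * rng.normal()
        t += dt
    return (x > 0), t

def reward(correct, rt, time_cost=1.0, error_cost=5.0):
    """Single-trial reward whose average over trials tracks Wald's cost function."""
    return -(time_cost * rt + (0.0 if correct else error_cost))

# Two-factor reward-modulated rule (Williams' REINFORCE): sample a threshold
# around the current mean, then correlate the perturbation with the reward
# prediction error relative to a running baseline.
m, sigma, lr = 0.5, 0.2, 0.01   # threshold mean, exploration noise, step size
baseline = 0.0
for trial in range(5000):
    a = max(1e-3, m + sigma * rng.normal())          # sampled threshold
    correct, rt = ddm_trial(drift=0.5, threshold=a)
    r = reward(correct, rt)
    m += lr * (r - baseline) * (a - m) / sigma**2    # REINFORCE update
    baseline += 0.1 * (r - baseline)                 # running reward baseline
```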
Electricity Price Prediction for Energy Storage System Arbitrage: A Decision-focused Approach
Sang, Linwei, Xu, Yinliang, Long, Huan, Hu, Qinran, Sun, Hongbin
Electricity price prediction plays a vital role in energy storage system (ESS) management. Current prediction models focus on reducing prediction errors but overlook their impact on downstream decision-making. This paper therefore proposes a decision-focused electricity price prediction approach for ESS arbitrage that bridges the gap between the downstream optimization model and the prediction model. The decision-focused approach uses the downstream arbitrage model to train the prediction model. It measures the difference between the actual decisions under the predicted price and the oracle decisions under the true price, i.e., the decision error, by regret; transforms this into a tractable surrogate regret; and then derives gradients with respect to the predicted price for training the prediction model. Based on the prediction and decision errors, this paper proposes a hybrid loss and a corresponding stochastic gradient descent learning method that trains prediction models for both prediction and decision accuracy. The case study verifies that, compared to prediction models that only minimize prediction errors, the proposed approach delivers greater economic benefit and reduces decision errors by flattening the time distribution of prediction errors. (An illustrative sketch of a surrogate regret appears after the tags below.)
- North America > United States > Tennessee > Knox County > Knoxville (0.14)
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- (11 more...)
- Energy > Power Industry (1.00)
- Energy > Energy Storage (1.00)
- Information Technology > Modeling & Simulation (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
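To make the regret-to-surrogate step concrete, the sketch below implements one well-known tractable surrogate of this kind, the SPO+ loss of Elmachtoub and Grigas, as a PyTorch autograd function. The toy `solve_arbitrage` oracle and the hybrid-loss weights are assumptions standing in for the paper's actual ESS arbitrage model.

```python
import torch
import torch.nn.functional as F

def solve_arbitrage(price):
    """Hypothetical arbitrage oracle: w* = argmin_w price @ w over the ESS
    feasible set. Toy stand-in: charge (+1) at below-median prices and
    discharge (-1) otherwise, an approximately net-zero schedule."""
    return torch.where(price < price.median(), 1.0, -1.0)

class SPOPlus(torch.autograd.Function):
    """SPO+ surrogate regret (Elmachtoub & Grigas) for min_w price @ w."""

    @staticmethod
    def forward(ctx, pred, true):
        w_true = solve_arbitrage(true)             # oracle decision
        w_spo = solve_arbitrage(2 * pred - true)   # decision at perturbed price
        ctx.save_for_backward(w_true, w_spo)
        return -(2 * pred - true) @ w_spo + 2 * pred @ w_true - true @ w_true

    @staticmethod
    def backward(ctx, grad_out):
        w_true, w_spo = ctx.saved_tensors
        # Subgradient wrt the predicted price: 2 * (w*(true) - w*(2*pred - true)).
        return grad_out * 2 * (w_true - w_spo), None

# Hybrid loss: weighted prediction error (MSE) plus decision error (SPO+).
model = torch.nn.Linear(24, 24)                    # assumed toy price predictor
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
feats, true_price = torch.randn(24), torch.rand(24)

pred_price = model(feats)
loss = 0.5 * F.mse_loss(pred_price, true_price) \
     + 0.5 * SPOPlus.apply(pred_price, true_price)
loss.backward()
opt.step()
```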
PyEPO: A PyTorch-based End-to-End Predict-then-Optimize Library for Linear and Integer Programming
Tang, Bo, Khalil, Elias B.
In deterministic optimization, it is typically assumed that all problem parameters are fixed and known. In practice, however, some parameters may be a priori unknown but can be estimated from historical data. A typical predict-then-optimize approach separates prediction and optimization into two stages. Recently, end-to-end predict-then-optimize has become an attractive alternative. In this work, we present the PyEPO package, a PyTorch-based end-to-end predict-then-optimize library in Python. To the best of our knowledge, PyEPO (pronounced like pineapple with a silent "n") is the first such generic tool for linear and integer programming with predicted objective function coefficients. It provides four base algorithms: a convex surrogate loss function from the seminal work of Elmachtoub and Grigas [16], a differentiable black-box solver approach of Pogancic et al. [35], and two differentiable perturbation-based methods from Berthet et al. [6]. PyEPO provides a simple interface for the definition of new optimization problems, the implementation of state-of-the-art predict-then-optimize training algorithms, the use of custom neural network architectures, and the comparison of end-to-end approaches with the two-stage approach. PyEPO enables us to conduct a comprehensive set of experiments comparing a number of end-to-end and two-stage approaches along axes such as prediction accuracy, decision quality, and running time on problems such as Shortest Path, Multiple Knapsack, and the Traveling Salesperson Problem. We discuss some empirical insights from these experiments, which could guide future research. PyEPO and its documentation are available at https://github.com/khalil-research/PyEPO. (An illustrative sketch of one of the four base algorithms appears after the tags below.)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- Energy (0.46)
- Leisure & Entertainment > Games (0.46)
- Transportation (0.34)
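For a flavor of what one of these base algorithms looks like, the sketch below gives a generic, self-contained implementation of the differentiable black-box approach of Pogancic et al.: the backward pass probes the solver at a perturbed cost vector and returns a finite-difference gradient. It does not use the PyEPO API; the top-k `solve` oracle and the interpolation constant are assumptions.

```python
import torch
import torch.nn.functional as F

def solve(costs):
    """Hypothetical combinatorial oracle: min-cost selection of k items,
    standing in for a shortest-path or knapsack solver."""
    w = torch.zeros_like(costs)
    w[torch.topk(-costs, 3).indices] = 1.0
    return w

class BlackboxDiff(torch.autograd.Function):
    """Differentiable black-box solver of Pogancic et al."""
    LAM = 10.0  # interpolation strength (hyperparameter)

    @staticmethod
    def forward(ctx, costs):
        w = solve(costs)
        ctx.save_for_backward(costs, w)
        return w

    @staticmethod
    def backward(ctx, grad_w):
        costs, w = ctx.saved_tensors
        # Re-solve at costs perturbed in the direction of the incoming gradient.
        w_prime = solve(costs + BlackboxDiff.LAM * grad_w)
        return -(w - w_prime) / BlackboxDiff.LAM

# End-to-end step: predict costs from features, solve, compare decisions.
model = torch.nn.Linear(10, 10)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
feats, w_target = torch.randn(10), solve(torch.rand(10))

w_pred = BlackboxDiff.apply(model(feats))
loss = F.mse_loss(w_pred, w_target)   # loss in decision space
loss.backward()
opt.step()
```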
Efficient LSTM Training with Eligibility Traces
Hoyer, Michael, Eivazi, Shahram, Otte, Sebastian
Training recurrent neural networks is predominantly achieved via backpropagation through time (BPTT). However, this algorithm is not an optimal solution from either a biological or a computational perspective. A more efficient and biologically plausible alternative to BPTT is e-prop. We investigate the applicability of e-prop to long short-term memory networks (LSTMs), for both supervised and reinforcement learning (RL) tasks. We show that e-prop is a suitable optimization algorithm for LSTMs by comparing it to BPTT on two benchmarks for supervised learning. This demonstrates that e-prop can achieve learning even on problems with long sequences of several hundred timesteps. We introduce extensions that improve the performance of e-prop, some of which can also be applied to other network architectures. With the help of these extensions we show that, under certain conditions, e-prop can outperform BPTT on one of the two supervised learning benchmarks. Finally, we deliver a proof of concept for integrating e-prop into RL in the domain of deep recurrent Q-learning. (An illustrative sketch of an eligibility-trace update appears after the tags below.)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.15)
- North America > Canada > Ontario > Toronto (0.14)
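The core of e-prop is replacing BPTT's backward pass with eligibility traces propagated forward in time and combined online with a learning signal. The sketch below shows this two-factor update on a toy leaky-integrator network rather than a full LSTM; the network, task, and constants are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy network trained online with e-prop-style traces:
#   h_t = alpha * h_{t-1} + W @ x_t,   y_t = v @ h_t.
n_in, n_hid, alpha, lr = 4, 8, 0.9, 1e-2
W = rng.normal(scale=0.1, size=(n_hid, n_in))
v = rng.normal(scale=0.1, size=n_hid)

for step in range(1000):
    xs = rng.normal(size=(20, n_in))
    target = xs[:, 0].cumsum() * 0.1          # toy temporal target
    h = np.zeros(n_hid)
    eps = np.zeros_like(W)                    # eligibility traces d h / d W
    dW = np.zeros_like(W)
    for x, y_star in zip(xs, target):
        h = alpha * h + W @ x
        eps = alpha * eps + np.outer(np.ones(n_hid), x)  # forward trace update
        y = v @ h
        L = v * (y - y_star)                  # learning signal via readout weights
        dW += L[:, None] * eps                # two-factor update, no backward pass
    W -= lr * dW
```

Because the traces are updated forward in time, memory cost stays constant in sequence length, which is the computational advantage over BPTT the abstract points to.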
Convergence of a Neural Network Classifier
Baras, John S., LaVigna, Anthony
In this paper, we prove that the codebook vectors in the learning vector quantization (LVQ) algorithm converge. We do this by showing that the learning algorithm performs stochastic approximation. Convergence is then obtained by identifying the appropriate conditions on the learning rate and on the underlying statistics of the classification problem. We also present a modification to the learning algorithm which, we argue, results in convergence of the LVQ error to the Bayes-optimal error as the appropriate parameters become large. (An illustrative sketch of the update rule appears after the tags below.)
- North America > United States > Maryland > Prince George's County > College Park (0.15)
- North America > United States > Texas (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
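For reference, the sketch below shows the LVQ1 iteration being analyzed: a stochastic-approximation update that attracts the winning codebook vector on a class match and repels it on a mismatch. The decreasing Robbins-Monro step size (sum of rates infinite, sum of squares finite) reflects the kind of learning-rate condition under which such convergence results hold; the schedule and sampling scheme here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def lvq1_fit(X, y, codebook, labels, n_steps=10000):
    """LVQ1 as stochastic approximation over randomly drawn samples."""
    for t in range(n_steps):
        lr = 1.0 / (t + 10)                  # Robbins-Monro step-size schedule
        i = rng.integers(len(X))
        x, c = X[i], y[i]
        w = np.argmin(np.linalg.norm(codebook - x, axis=1))  # winning vector
        sign = 1.0 if labels[w] == c else -1.0
        codebook[w] += sign * lr * (x - codebook[w])         # attract or repel
    return codebook

# Toy usage: two Gaussian classes, one codebook vector per class.
X = np.vstack([rng.normal(-1, 1, (100, 2)), rng.normal(1, 1, (100, 2))])
y = np.repeat([0, 1], 100)
codebook = lvq1_fit(X, y, rng.normal(size=(2, 2)), labels=np.array([0, 1]))
```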